Abstract

This seminar explores the fundamental role of convex optimization as the mathematical backbone of modern machine learning algorithms. Convex optimization represents one of the few classes of optimization problems that can be reliably solved to global optimality, making it indispensable for machine learning applications. The presentation demonstrates why convex optimization is crucial for ML by examining its theoretical guarantees—unlike general nonconvex problems, which are intractable in the worst case, convex problems such as least-squares, linear programming, and semidefinite programming admit efficient, reliable algorithms whose roots trace back to Gauss and Fourier.

The seminar provides a comprehensive examination of machine learning through four critical perspectives: statistical foundations (including maximum likelihood estimation and its equivalence to minimizing the KL divergence between the empirical and model distributions), computer science architectures (neural networks and hyperparameter optimization), numerical algorithms (stochastic gradient descent and backpropagation), and hardware acceleration via GPU parallelism. Key machine learning methods are analyzed through the convex optimization lens, including linear regression (formulated as a least-squares problem), support vector machines with kernel transformations, and the mathematical foundations of neural network training via backpropagation and the chain rule.
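As a minimal illustration of the least-squares formulation of linear regression mentioned above, the following sketch (using NumPy; the data and variable names are purely illustrative, not drawn from the seminar) solves min‖Xw − y‖² and recovers the generating weights, reflecting the global-optimality guarantee of convex problems:

```python
# Illustrative sketch: linear regression as a least-squares problem.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + small noise (hypothetical example).
X = rng.normal(size=(100, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

# Solve the convex problem min_w ||X w - y||^2; because the objective is
# convex, the solver returns the global optimum, not a local one.
w_hat, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
```

With low noise, `w_hat` closely matches `w_true`, which is exactly the reliability property the abstract attributes to convex formulations.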

The presentation concludes with an exploration of deep learning architectures, particularly convolutional neural networks (CNNs) for image processing and recurrent neural networks (RNNs) for sequential data, alongside their applications in emerging domains including IoT, cybersecurity, autonomous vehicles, and biomedical research. Special attention is given to how traditional rule-based systems can be enhanced through deep learning approaches, and to the competitive landscape in AI development among major technology companies. This overview demonstrates how convex optimization theory provides both the mathematical rigor and the practical algorithms underpinning the current AI revolution.